Overview

Dataset statistics

Number of variables: 20
Number of observations: 627868
Missing cells: 1098738
Missing cells (%): 8.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 95.8 MiB
Average record size in memory: 160.0 B
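
The percentage and per-record figures above follow from the raw counts. The sketch below recomputes them as a consistency check; the variable names are illustrative, and it assumes the report defines a "cell" as one variable in one observation and uses 1 MiB = 1024 × 1024 bytes.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 20
n_observations = 627_868
missing_cells = 1_098_738
total_memory_mib = 95.8

# Assumption: a "cell" is one variable in one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the table.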

Variable types

Categorical: 10
Numeric: 7
Unsupported: 2
Boolean: 1

Warnings

State has constant value "Cross River" Constant
Regimen has a high cardinality: 88 distinct values High cardinality
PHARMACY_ID is highly correlated with PATIENT_ID and 1 other fields High correlation
PATIENT_ID is highly correlated with PHARMACY_ID and 1 other fields High correlation
FACILITY_ID is highly correlated with PHARMACY_ID and 1 other fields High correlation
AFTERNOON is highly correlated with EVENING High correlation
EVENING is highly correlated with AFTERNOON High correlation
PHARMACY_ID is highly correlated with PATIENT_ID High correlation
PATIENT_ID is highly correlated with PHARMACY_ID and 1 other fields High correlation
FACILITY_ID is highly correlated with PATIENT_ID High correlation
PATIENT_ID is highly correlated with FACILITY_ID High correlation
FACILITY_ID is highly correlated with PATIENT_ID High correlation
DMOC_TYPE is highly correlated with PHARMACY_ID and 1 other fields High correlation
FACILITY_ID is highly correlated with PHARMACY_ID and 3 other fields High correlation
ADHERENCE is highly correlated with Regimen High correlation
PHARMACY_ID is highly correlated with DMOC_TYPE and 4 other fields High correlation
Regimen Line is highly correlated with Regimen High correlation
L.G.A is highly correlated with FACILITY_ID and 3 other fields High correlation
Regimen is highly correlated with ADHERENCE and 2 other fields High correlation
EVENING is highly correlated with AFTERNOON and 1 other fields High correlation
AFTERNOON is highly correlated with EVENING and 1 other fields High correlation
PATIENT_ID is highly correlated with FACILITY_ID and 3 other fields High correlation
Facility Name is highly correlated with DMOC_TYPE and 4 other fields High correlation
MORNING is highly correlated with EVENING and 1 other fields High correlation
ADR_IDS is highly correlated with Regimen High correlation
ADR_SCREENED has 10322 (1.6%) missing values Missing
ADR_IDS has 627858 (> 99.9%) missing values Missing
DMOC_TYPE has 460552 (73.4%) missing values Missing
DURATION is highly skewed (γ1 = 606.5115674) Skewed
MORNING is highly skewed (γ1 = 148.0439747) Skewed
EVENING is highly skewed (γ1 = 324.8214677) Skewed
BODY_WEIGHT is highly skewed (γ1 = 138.7964098) Skewed
PHARMACY_ID has unique values Unique
DATE_VISIT is an unsupported type, check if it needs cleaning or further analysis Unsupported
NEXT_APPOINTMENT is an unsupported type, check if it needs cleaning or further analysis Unsupported
MORNING has 225607 (35.9%) zeros Zeros
EVENING has 199008 (31.7%) zeros Zeros
BODY_WEIGHT has 625049 (99.6%) zeros Zeros
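
The "Skewed" warnings above report a skewness coefficient γ1 in the hundreds. As a rough illustration of how a single extreme value can produce that, the sketch below computes the population (Fisher-Pearson) skewness with the standard library. The sample data is made up for illustration, and the exact estimator used by the report may differ (for example, it may apply a bias correction).

```python
from statistics import fmean


def skewness(values):
    """Population (Fisher-Pearson) skewness: g1 = m3 / m2 ** 1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical sample: symmetric values around 60, repeated to 1000 rows.
typical = [30, 60, 60, 60, 90] * 200
# The same sample with one extreme value appended.
with_outlier = typical + [90212]

print(skewness(typical))
print(skewness(with_outlier))
```

The symmetric sample has zero skewness, while one extreme row pushes the coefficient to roughly the square root of the row count. This suggests the flagged columns may be dominated by a handful of outliers rather than by a genuinely long-tailed distribution.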

Reproduction

Analysis started: 2021-06-15 08:56:22.296203
Analysis finished: 2021-06-15 08:57:40.913466
Duration: 1 minute and 18.62 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
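
The reported duration can be rederived from the two timestamps. A minimal sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section above.
started = datetime.fromisoformat("2021-06-15 08:56:22.296203")
finished = datetime.fromisoformat("2021-06-15 08:57:40.913466")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```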

Variables

State
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
Cross River | 627868

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 6906548
Distinct characters: 9
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
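
As a rough illustration of the character-property analysis described above, the sketch below counts characters and Unicode general categories for the constant value "Cross River" using Python's standard `unicodedata` module. It is not the report's own implementation. Script and block properties are not available in the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

value = "Cross River"

# Count each distinct character in the value.
char_counts = Counter(value)

# Map each character to its Unicode general category:
# Lu = Uppercase Letter, Ll = Lowercase Letter, Zs = Space Separator.
category_counts = Counter(unicodedata.category(ch) for ch in value)

print(len(value), len(char_counts), dict(category_counts))
```

For this value the counts are 11 characters, 9 distinct characters and 3 categories, matching the per-row figures in this section (the report multiplies them by 627868 rows).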

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Cross River
2nd row: Cross River
3rd row: Cross River
4th row: Cross River
5th row: Cross River

Common Values

Value | Count | Frequency (%)
Cross River | 627868 | 100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
cross627868
50.0%
river627868
50.0%

Most occurring characters

ValueCountFrequency (%)
r1255736
18.2%
s1255736
18.2%
C627868
9.1%
o627868
9.1%
627868
9.1%
R627868
9.1%
i627868
9.1%
v627868
9.1%
e627868
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5022944
72.7%
Uppercase Letter1255736
 
18.2%
Space Separator627868
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r1255736
25.0%
s1255736
25.0%
o627868
12.5%
i627868
12.5%
v627868
12.5%
e627868
12.5%
Uppercase Letter
ValueCountFrequency (%)
C627868
50.0%
R627868
50.0%
Space Separator
ValueCountFrequency (%)
627868
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin6278680
90.9%
Common627868
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r1255736
20.0%
s1255736
20.0%
C627868
10.0%
o627868
10.0%
R627868
10.0%
i627868
10.0%
v627868
10.0%
e627868
10.0%
Common
ValueCountFrequency (%)
627868
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII6906548
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r1255736
18.2%
s1255736
18.2%
C627868
9.1%
o627868
9.1%
627868
9.1%
R627868
9.1%
i627868
9.1%
v627868
9.1%
e627868
9.1%

L.G.A
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
Ogoja | 367647
Obudu | 143668
Obanliku | 63739
Yala | 52814

Length

Max length: 8
Median length: 5
Mean length: 5.220433276
Min length: 4

Characters and Unicode

Total characters: 3277743
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Obudu
2nd row: Obudu
3rd row: Obudu
4th row: Obudu
5th row: Obudu

Common Values

Value | Count | Frequency (%)
Ogoja | 367647 | 58.6%
Obudu | 143668 | 22.9%
Obanliku | 63739 | 10.2%
Yala | 52814 | 8.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
ogoja367647
58.6%
obudu143668
 
22.9%
obanliku63739
 
10.2%
yala52814
 
8.4%

Most occurring characters

ValueCountFrequency (%)
O575054
17.5%
a537014
16.4%
g367647
11.2%
o367647
11.2%
j367647
11.2%
u351075
10.7%
b207407
 
6.3%
d143668
 
4.4%
l116553
 
3.6%
n63739
 
1.9%
Other values (3)180292
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2649875
80.8%
Uppercase Letter627868
 
19.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a537014
20.3%
g367647
13.9%
o367647
13.9%
j367647
13.9%
u351075
13.2%
b207407
 
7.8%
d143668
 
5.4%
l116553
 
4.4%
n63739
 
2.4%
i63739
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
O575054
91.6%
Y52814
 
8.4%

Most occurring scripts

ValueCountFrequency (%)
Latin3277743
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O575054
17.5%
a537014
16.4%
g367647
11.2%
o367647
11.2%
j367647
11.2%
u351075
10.7%
b207407
 
6.3%
d143668
 
4.4%
l116553
 
3.6%
n63739
 
1.9%
Other values (3)180292
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII3277743
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O575054
17.5%
a537014
16.4%
g367647
11.2%
o367647
11.2%
j367647
11.2%
u351075
10.7%
b207407
 
6.3%
d143668
 
4.4%
l116553
 
3.6%
n63739
 
1.9%
Other values (3)180292
 
5.5%

Facility Name
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
Ogoja General Hospital | 193461
Ogoja Catholic Maternity Hospital | 158799
Sacred Heart Catholic Hospital | 118501
Obanliku General Hospital | 63739
Yala Lutheran Hospital | 30441
Other values (6) | 62927

Length

Max length: 35
Median length: 25
Mean length: 26.72374607
Min length: 12

Characters and Unicode

Total characters: 16778985
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Obudu Clinic
2nd row: Obudu Clinic
3rd row: Obudu Clinic
4th row: Obudu Clinic
5th row: Obudu Clinic

Common Values

Value | Count | Frequency (%)
Ogoja General Hospital | 193461 | 30.8%
Ogoja Catholic Maternity Hospital | 158799 | 25.3%
Sacred Heart Catholic Hospital | 118501 | 18.9%
Obanliku General Hospital | 63739 | 10.2%
Yala Lutheran Hospital | 30441 | 4.8%
Obudu Clinic | 17916 | 2.9%
Ogoja Santa Maria Clinic | 15387 | 2.5%
Oba Comprehensive Health Centre | 9632 | 1.5%
Okpoma General Hospital | 9256 | 1.5%
Obudu Urban1 Primary Health Centre | 7251 | 1.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
hospital574197
26.3%
ogoja367647
16.8%
catholic277300
12.7%
general266456
12.2%
maternity158799
 
7.3%
sacred118501
 
5.4%
heart118501
 
5.4%
obanliku63739
 
2.9%
clinic33303
 
1.5%
yala30441
 
1.4%
Other values (12)177110
 
8.1%

Most occurring characters

ValueCountFrequency (%)
a2155254
12.8%
1558126
 
9.3%
t1374160
 
8.2%
l1265804
 
7.5%
o1241517
 
7.4%
i1179881
 
7.0%
e1063094
 
6.3%
r763323
 
4.5%
H713066
 
4.2%
n612346
 
3.6%
Other values (24)4852414
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter13027614
77.6%
Uppercase Letter2185994
 
13.0%
Space Separator1558126
 
9.3%
Decimal Number7251
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a2155254
16.5%
t1374160
10.5%
l1265804
9.7%
o1241517
9.5%
i1179881
9.1%
e1063094
8.2%
r763323
 
5.9%
n612346
 
4.7%
p596570
 
4.6%
s587314
 
4.5%
Other values (11)2188351
16.8%
Uppercase Letter
ValueCountFrequency (%)
H713066
32.6%
O475441
21.7%
C344088
15.7%
G266456
 
12.2%
M174186
 
8.0%
S133888
 
6.1%
Y30441
 
1.4%
L30441
 
1.4%
U7251
 
0.3%
P7251
 
0.3%
Space Separator
ValueCountFrequency (%)
1558126
100.0%
Decimal Number
ValueCountFrequency (%)
17251
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin15213608
90.7%
Common1565377
 
9.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a2155254
14.2%
t1374160
 
9.0%
l1265804
 
8.3%
o1241517
 
8.2%
i1179881
 
7.8%
e1063094
 
7.0%
r763323
 
5.0%
H713066
 
4.7%
n612346
 
4.0%
p596570
 
3.9%
Other values (22)4248593
27.9%
Common
ValueCountFrequency (%)
1558126
99.5%
17251
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII16778985
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a2155254
12.8%
1558126
 
9.3%
t1374160
 
8.2%
l1265804
 
7.5%
o1241517
 
7.4%
i1179881
 
7.0%
e1063094
 
6.3%
r763323
 
4.5%
H713066
 
4.2%
n612346
 
3.6%
Other values (24)4852414
28.9%

Regimen Line
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
ART First Line Adult | 469457
Cotrimoxazole (CTX) Prophylaxis | 98270
ART First Line Children | 21258
Isoniazid Preventive Therapy (IPT) | 19569
ART Second Line Adult | 18451
Other values (8) | 863

Length

Max length: 46
Median length: 20
Mean length: 22.29019635
Min length: 4

Characters and Unicode

Total characters: 13995301
Distinct characters: 39
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ART First Line Adult
2nd row: ART First Line Adult
3rd row: ART First Line Adult
4th row: Cotrimoxazole (CTX) Prophylaxis
5th row: ART First Line Adult

Common Values

Value | Count | Frequency (%)
ART First Line Adult | 469457 | 74.8%
Cotrimoxazole (CTX) Prophylaxis | 98270 | 15.7%
ART First Line Children | 21258 | 3.4%
Isoniazid Preventive Therapy (IPT) | 19569 | 3.1%
ART Second Line Adult | 18451 | 2.9%
ART Second Line Children | 612 | 0.1%
OI Treatment | 209 | < 0.1%
Other Medicines | 22 | < 0.1%
TB Treatment Adult | 9 | < 0.1%
Other anti-infectives (including STI Medicine) | 5 | < 0.1%
Other values (3) | 6 | < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
art509778
21.1%
line509778
21.1%
first490715
20.3%
adult487917
20.2%
prophylaxis98273
 
4.1%
cotrimoxazole98270
 
4.1%
ctx98270
 
4.1%
children21871
 
0.9%
therapy19569
 
0.8%
preventive19569
 
0.8%
Other values (16)58719
 
2.4%

Most occurring characters

ValueCountFrequency (%)
1784861
 
12.8%
i1277693
 
9.1%
t1096949
 
7.8%
A997695
 
7.1%
r748518
 
5.3%
e727790
 
5.2%
l706336
 
5.0%
T647420
 
4.6%
s608590
 
4.3%
n590116
 
4.2%
Other values (29)4809333
34.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8306549
59.4%
Uppercase Letter3668192
26.2%
Space Separator1784861
 
12.8%
Open Punctuation117847
 
0.8%
Close Punctuation117847
 
0.8%
Dash Punctuation5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i1277693
15.4%
t1096949
13.2%
r748518
9.0%
e727790
8.8%
l706336
8.5%
s608590
7.3%
n590116
7.1%
d548452
6.6%
u487925
 
5.9%
o431721
 
5.2%
Other values (11)1082459
13.0%
Uppercase Letter
ValueCountFrequency (%)
A997695
27.2%
T647420
17.6%
R509778
13.9%
L509778
13.9%
F490715
13.4%
C218411
 
6.0%
P137424
 
3.7%
X98270
 
2.7%
I39352
 
1.1%
S19068
 
0.5%
Other values (4)281
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1784861
100.0%
Open Punctuation
ValueCountFrequency (%)
(117847
100.0%
Close Punctuation
ValueCountFrequency (%)
)117847
100.0%
Dash Punctuation
ValueCountFrequency (%)
-5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin11974741
85.6%
Common2020560
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i1277693
 
10.7%
t1096949
 
9.2%
A997695
 
8.3%
r748518
 
6.3%
e727790
 
6.1%
l706336
 
5.9%
T647420
 
5.4%
s608590
 
5.1%
n590116
 
4.9%
d548452
 
4.6%
Other values (25)4025182
33.6%
Common
ValueCountFrequency (%)
1784861
88.3%
(117847
 
5.8%
)117847
 
5.8%
-5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII13995301
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1784861
 
12.8%
i1277693
 
9.1%
t1096949
 
7.8%
A997695
 
7.1%
r748518
 
5.3%
e727790
 
5.2%
l706336
 
5.0%
T647420
 
4.6%
s608590
 
4.3%
n590116
 
4.2%
Other values (29)4809333
34.4%

Regimen
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 88
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
TDF(300mg)+3TC(300mg)+DTG(50mg) | 206641
TDF(300mg)+3TC(300mg)+EFV(600mg) | 150496
AZT(300mg)+3TC(150mg)+NVP(200mg) | 101110
Cotrimoxazole 960mg | 93214
Isoniazid 300mg | 19095
Other values (83) | 57312

Length

Max length: 62
Median length: 31
Mean length: 29.32948008
Min length: 12

Characters and Unicode

Total characters: 18415042
Distinct characters: 56
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: TDF(300mg)+3TC(300mg)+DTG(50mg)
2nd row: TDF(300mg)+3TC(300mg)+EFV(600mg)
3rd row: TDF(300mg)+3TC(300mg)+DTG(50mg)
4th row: Cotrimoxazole 960mg
5th row: TDF(300mg)+3TC(300mg)+EFV(600mg)

Common Values

Value | Count | Frequency (%)
TDF(300mg)+3TC(300mg)+DTG(50mg) | 206641 | 32.9%
TDF(300mg)+3TC(300mg)+EFV(600mg) | 150496 | 24.0%
AZT(300mg)+3TC(150mg)+NVP(200mg) | 101110 | 16.1%
Cotrimoxazole 960mg | 93214 | 14.8%
Isoniazid 300mg | 19095 | 3.0%
AZT(10mg/ml)+3TC(10mg/ml)+NVP(10mg/ml) | 11386 | 1.8%
TDF(300mg)+3TC(150mg)+LPV/r(200/50mg) | 6349 | 1.0%
Cotrimoxazole 480mg | 4351 | 0.7%
TDF(300mg)+3TC(150mg)+ATV/r(300/100mg) | 4296 | 0.7%
TDF(300mg)+3TC(30mg)+DTG(50mg) | 3367 | 0.5%
Other values (78) | 27563 | 4.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
tdf(300mg)+3tc(300mg)+dtg(50mg206641
27.7%
tdf(300mg)+3tc(300mg)+efv(600mg150496
20.2%
azt(300mg)+3tc(150mg)+nvp(200mg101110
13.6%
cotrimoxazole98366
13.2%
960mg93214
12.5%
isoniazid19479
 
2.6%
300mg19095
 
2.6%
azt(10mg/ml)+3tc(10mg/ml)+nvp(10mg/ml11386
 
1.5%
tdf(300mg)+3tc(150mg)+lpv/r(200/50mg6349
 
0.9%
480mg4351
 
0.6%
Other values (91)35457
 
4.8%

Most occurring characters

ValueCountFrequency (%)
02836166
15.4%
m1775684
9.6%
g1639258
 
8.9%
(1520232
 
8.3%
)1520232
 
8.3%
31401695
 
7.6%
T1235348
 
6.7%
+1010449
 
5.5%
C613857
 
3.3%
D592983
 
3.2%
Other values (46)4269138
23.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number5235615
28.4%
Lowercase Letter4715096
25.6%
Uppercase Letter4196819
22.8%
Open Punctuation1520232
 
8.3%
Close Punctuation1520232
 
8.3%
Math Symbol1010449
 
5.5%
Space Separator118076
 
0.6%
Other Punctuation98523
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m1775684
37.7%
g1639258
34.8%
o315008
 
6.7%
i137427
 
2.9%
l136815
 
2.9%
r119388
 
2.5%
a118103
 
2.5%
z118061
 
2.5%
e98609
 
2.1%
t98436
 
2.1%
Other values (12)158307
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T1235348
29.4%
C613857
14.6%
D592983
14.1%
F537658
12.8%
V296700
 
7.1%
G212685
 
5.1%
E155724
 
3.7%
A137570
 
3.3%
P132729
 
3.2%
Z123629
 
2.9%
Other values (8)157936
 
3.8%
Decimal Number
ValueCountFrequency (%)
02836166
54.2%
31401695
26.8%
5347525
 
6.6%
6253126
 
4.8%
1167747
 
3.2%
2123543
 
2.4%
993214
 
1.8%
47551
 
0.1%
84954
 
0.1%
794
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/98522
> 99.9%
,1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
(1520232
100.0%
Close Punctuation
ValueCountFrequency (%)
)1520232
100.0%
Math Symbol
ValueCountFrequency (%)
+1010449
100.0%
Space Separator
ValueCountFrequency (%)
118076
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common9503127
51.6%
Latin8911915
48.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
m1775684
19.9%
g1639258
18.4%
T1235348
13.9%
C613857
 
6.9%
D592983
 
6.7%
F537658
 
6.0%
o315008
 
3.5%
V296700
 
3.3%
G212685
 
2.4%
E155724
 
1.7%
Other values (30)1537010
17.2%
Common
ValueCountFrequency (%)
02836166
29.8%
(1520232
16.0%
)1520232
16.0%
31401695
14.7%
+1010449
 
10.6%
5347525
 
3.7%
6253126
 
2.7%
1167747
 
1.8%
2123543
 
1.3%
118076
 
1.2%
Other values (6)204336
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII18415042
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
02836166
15.4%
m1775684
9.6%
g1639258
 
8.9%
(1520232
 
8.3%
)1520232
 
8.3%
31401695
 
7.6%
T1235348
 
6.7%
+1010449
 
5.5%
C613857
 
3.3%
D592983
 
3.2%
Other values (46)4269138
23.2%

PHARMACY_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 627868
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1776095.166
Minimum: 43649
Maximum: 4082401
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 MiB

Quantile statistics

Minimum: 43649
5-th percentile: 258862.15
Q1: 708182.75
median: 1362669
Q3: 3126253.25
95-th percentile: 3824185.65
Maximum: 4082401
Range: 4038752
Interquartile range (IQR): 2418070.5

Descriptive statistics

Standard deviation: 1255255.564
Coefficient of variation (CV): 0.7067501717
Kurtosis: -1.314320371
Mean: 1776095.166
Median Absolute Deviation (MAD): 817529
Skewness: 0.4738560264
Sum: 1.11515332 × 10^12
Variance: 1.57566653 × 10^12
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1054723 | 1 | < 0.1%
2840650 | 1 | < 0.1%
238369 | 1 | < 0.1%
244514 | 1 | < 0.1%
1291043 | 1 | < 0.1%
232228 | 1 | < 0.1%
2800320 | 1 | < 0.1%
234279 | 1 | < 0.1%
258859 | 1 | < 0.1%
250671 | 1 | < 0.1%
Other values (627858) | 627858 | > 99.9%

Value | Count | Frequency (%)
43649 | 1 | < 0.1%
43651 | 1 | < 0.1%
43655 | 1 | < 0.1%
43658 | 1 | < 0.1%
43663 | 1 | < 0.1%
43665 | 1 | < 0.1%
43667 | 1 | < 0.1%
43671 | 1 | < 0.1%
43673 | 1 | < 0.1%
43676 | 1 | < 0.1%

Value | Count | Frequency (%)
4082401 | 1 | < 0.1%
4082400 | 1 | < 0.1%
4082399 | 1 | < 0.1%
4082398 | 1 | < 0.1%
4082397 | 1 | < 0.1%
4082392 | 1 | < 0.1%
4082391 | 1 | < 0.1%
4082390 | 1 | < 0.1%
4082386 | 1 | < 0.1%
4082385 | 1 | < 0.1%

PATIENT_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 18825
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52581.71467
Minimum: 19095
Maximum: 160831
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 MiB

Quantile statistics

Minimum: 19095
5-th percentile: 20738
Q1: 24167
median: 29628
Q3: 35078
95-th percentile: 148392
Maximum: 160831
Range: 141736
Interquartile range (IQR): 10911

Descriptive statistics

Standard deviation: 47159.52209
Coefficient of variation (CV): 0.8968806434
Kurtosis: -0.0603474067
Mean: 52581.71467
Median Absolute Deviation (MAD): 5458
Skewness: 1.341205344
Sum: 3.301437603 × 10^10
Variance: 2224020523
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
148161 | 233 | < 0.1%
113055 | 231 | < 0.1%
89158 | 228 | < 0.1%
112854 | 224 | < 0.1%
116818 | 216 | < 0.1%
147777 | 212 | < 0.1%
112907 | 210 | < 0.1%
89206 | 202 | < 0.1%
116999 | 194 | < 0.1%
112807 | 193 | < 0.1%
Other values (18815) | 625725 | 99.7%

Value | Count | Frequency (%)
19095 | 4 | < 0.1%
19098 | 1 | < 0.1%
19101 | 44 | < 0.1%
19104 | 78 | < 0.1%
19107 | 8 | < 0.1%
19110 | 85 | < 0.1%
19111 | 79 | < 0.1%
19117 | 4 | < 0.1%
19120 | 50 | < 0.1%
19123 | 107 | < 0.1%

Value | Count | Frequency (%)
160831 | 4 | < 0.1%
160808 | 4 | < 0.1%
160807 | 4 | < 0.1%
160806 | 4 | < 0.1%
160805 | 4 | < 0.1%
160804 | 4 | < 0.1%
160802 | 10 | < 0.1%
160801 | 106 | < 0.1%
160800 | 5 | < 0.1%
160799 | 5 | < 0.1%

FACILITY_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1888.15524
Minimum: 1686
Maximum: 2935
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 MiB

Quantile statistics

Minimum: 1686
5-th percentile: 1696
Q1: 1752
median: 1753
Q3: 1753
95-th percentile: 2929
Maximum: 2935
Range: 1249
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 390.4319905
Coefficient of variation (CV): 0.2067796028
Kurtosis: 3.22749184
Mean: 1888.15524
Median Absolute Deviation (MAD): 1
Skewness: 2.272684395
Sum: 1185512254
Variance: 152437.1392
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)

Value | Count | Frequency (%)
1753 | 193461 | 30.8%
1752 | 158799 | 25.3%
1696 | 118501 | 18.9%
2929 | 63739 | 10.2%
1831 | 30441 | 4.8%
1686 | 17916 | 2.9%
1755 | 15387 | 2.5%
2935 | 9632 | 1.5%
1817 | 9256 | 1.5%
1687 | 7251 | 1.2%

Value | Count | Frequency (%)
1686 | 17916 | 2.9%
1687 | 7251 | 1.2%
1696 | 118501 | 18.9%
1752 | 158799 | 25.3%
1753 | 193461 | 30.8%
1755 | 15387 | 2.5%
1817 | 9256 | 1.5%
1831 | 30441 | 4.8%
2929 | 63739 | 10.2%
2933 | 3485 | 0.6%

Value | Count | Frequency (%)
2935 | 9632 | 1.5%
2933 | 3485 | 0.6%
2929 | 63739 | 10.2%
1831 | 30441 | 4.8%
1817 | 9256 | 1.5%
1755 | 15387 | 2.5%
1753 | 193461 | 30.8%
1752 | 158799 | 25.3%
1696 | 118501 | 18.9%
1687 | 7251 | 1.2%

DATE_VISIT
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

DURATION
Real number (ℝ)

SKEWED

Distinct: 145
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.84280295
Minimum: -30
Maximum: 90212
Zeros: 140
Zeros (%): < 0.1%
Negative: 1
Negative (%): < 0.1%
Memory size: 4.8 MiB

Quantile statistics

Minimum: -30
5-th percentile: 30
Q1: 60
median: 60
Q3: 90
95-th percentile: 180
Maximum: 90212
Range: 90242
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 124.3629103
Coefficient of variation (CV): 1.597616036
Kurtosis: 439469.154
Mean: 77.84280295
Median Absolute Deviation (MAD): 30
Skewness: 606.5115674
Sum: 48875005
Variance: 15466.13346
Monotonicity: Not monotonic
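
Several of the DURATION statistics above are derived from the others. The sketch below recomputes them from the reported figures as a consistency check and adds conventional 1.5 × IQR Tukey fences. The fences are an assumption for illustration and are not something the report itself applies.

```python
# Figures copied from the DURATION statistics above.
minimum, maximum = -30, 90212
q1, q3 = 60, 90
mean, std = 77.84280295, 124.3629103

value_range = maximum - minimum   # reported Range: 90242
iqr = q3 - q1                     # reported IQR: 30
cv = std / mean                   # reported CV: 1.597616036
variance = std ** 2               # reported Variance: 15466.13346

# Conventional Tukey fences (an assumption, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, round(cv, 6), round(variance, 2))
print(lower_fence, upper_fence)
```

With these fences the minimum (-30) and the maximum (90212) both fall outside the expected band. So would a frequent value such as 180, so any cut-off would need review against the domain rather than being applied mechanically.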
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
60 | 271500 | 43.2%
30 | 123588 | 19.7%
90 | 102713 | 16.4%
180 | 76678 | 12.2%
120 | 26150 | 4.2%
15 | 7481 | 1.2%
14 | 3859 | 0.6%
56 | 3111 | 0.5%
168 | 2643 | 0.4%
84 | 1949 | 0.3%
Other values (135) | 8196 | 1.3%

Value | Count | Frequency (%)
-30 | 1 | < 0.1%
0 | 140 | < 0.1%
1 | 4 | < 0.1%
2 | 35 | < 0.1%
3 | 18 | < 0.1%
5 | 2 | < 0.1%
6 | 26 | < 0.1%
7 | 11 | < 0.1%
8 | 5 | < 0.1%
9 | 9 | < 0.1%

Value | Count | Frequency (%)
90212 | 1 | < 0.1%
1800 | 7 | < 0.1%
1680 | 3 | < 0.1%
1220 | 1 | < 0.1%
1180 | 1 | < 0.1%
1120 | 1 | < 0.1%
1112 | 1 | < 0.1%
960 | 7 | < 0.1%
900 | 2 | < 0.1%
601 | 3 | < 0.1%

MORNING
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.767379212
Minimum: 0
Maximum: 2401
Zeros: 225607
Zeros (%): 35.9%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 1
Maximum: 2401
Range: 2401
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.062837272
Coefficient of variation (CV): 11.81011569
Kurtosis: 25144.8799
Mean: 0.767379212
Median Absolute Deviation (MAD): 0
Skewness: 148.0439747
Sum: 481812.8511
Variance: 82.13501941
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=42)

Value | Count | Frequency (%)
1 | 381250 | 60.7%
0 | 225607 | 35.9%
3 | 13055 | 2.1%
2 | 7798 | 1.2%
10 | 39 | < 0.1%
11 | 25 | < 0.1%
1201 | 17 | < 0.1%
601 | 9 | < 0.1%
1.03 | 7 | < 0.1%
31 | 5 | < 0.1%
Other values (32) | 56 | < 0.1%

Value | Count | Frequency (%)
0 | 225607 | 35.9%
0.03 | 3 | < 0.1%
0.06 | 1 | < 0.1%
1 | 381250 | 60.7%
1.0112 | 1 | < 0.1%
1.015 | 1 | < 0.1%
1.018 | 1 | < 0.1%
1.03 | 7 | < 0.1%
1.05 | 1 | < 0.1%
1.056 | 1 | < 0.1%

Value | Count | Frequency (%)
2401 | 1 | < 0.1%
1960 | 1 | < 0.1%
1801 | 1 | < 0.1%
1511 | 1 | < 0.1%
1201 | 17 | < 0.1%
960 | 4 | < 0.1%
901 | 4 | < 0.1%
603 | 2 | < 0.1%
601 | 9 | < 0.1%
600 | 1 | < 0.1%

AFTERNOON
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
0 | 627858
960 | 5
1 | 5

Length

Max length: 3
Median length: 1
Mean length: 1.000015927
Min length: 1

Characters and Unicode

Total characters: 627878
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 627858 | > 99.9%
960 | 5 | < 0.1%
1 | 5 | < 0.1%
 
< 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
0627858
> 99.9%
9605
 
< 0.1%
15
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0627863
> 99.9%
15
 
< 0.1%
95
 
< 0.1%
65
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number627878
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0627863
> 99.9%
15
 
< 0.1%
95
 
< 0.1%
65
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common627878
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0627863
> 99.9%
15
 
< 0.1%
95
 
< 0.1%
65
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII627878
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0627863
> 99.9%
15
 
< 0.1%
95
 
< 0.1%
65
 
< 0.1%

EVENING
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.738782037
Minimum: 0
Maximum: 960
Zeros: 199008
Zeros (%): 31.7%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 1
Maximum: 960
Range: 960
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.788758845
Coefficient of variation (CV): 3.774805971
Kurtosis: 111518.0424
Mean: 0.738782037
Median Absolute Deviation (MAD): 0
Skewness: 324.8214677
Sum: 463857.6
Variance: 7.777175895
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=17)

Value | Count | Frequency (%)
1 | 412371 | 65.7%
0 | 199008 | 31.7%
3 | 13048 | 2.1%
2 | 3415 | 0.5%
960 | 5 | < 0.1%
10 | 4 | < 0.1%
1.5 | 3 | < 0.1%
2.5 | 3 | < 0.1%
100 | 2 | < 0.1%
20 | 2 | < 0.1%
Other values (7) | 7 | < 0.1%

Value | Count | Frequency (%)
0 | 199008 | 31.7%
0.1 | 1 | < 0.1%
1 | 412371 | 65.7%
1.5 | 3 | < 0.1%
2 | 3415 | 0.5%
2.5 | 3 | < 0.1%
3 | 13048 | 2.1%
3.5 | 1 | < 0.1%
10 | 4 | < 0.1%
11 | 1 | < 0.1%

Value | Count | Frequency (%)
960 | 5 | < 0.1%
180 | 1 | < 0.1%
112 | 1 | < 0.1%
100 | 2 | < 0.1%
84 | 1 | < 0.1%
30 | 1 | < 0.1%
20 | 2 | < 0.1%
11 | 1 | < 0.1%
10 | 4 | < 0.1%
3.5 | 1 | < 0.1%

ADR_SCREENED
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 10322
Missing (%): 1.6%
Memory size: 1.2 MiB

Value | Count
False | 615707
True | 1839
(Missing) | 10322

Value | Count | Frequency (%)
False | 615707 | 98.1%
True | 1839 | 0.3%
(Missing) | 10322 | 1.6%

ADR_IDS
Categorical

HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): 30.0%
Missing: 627858
Missing (%): > 99.9%
Memory size: 4.8 MiB
10,1
4,6#1,1
8#1

Length

Max length: 7
Median length: 4
Mean length: 4.6
Min length: 3

Characters and Unicode

Total characters: 46
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10,1
2nd row: 10,1
3rd row: 4,6#1,1
4th row: 10,1
5th row: 10,1

Common Values

Value | Count | Frequency (%)
10,1 | 4 | < 0.1%
4,6#1,1 | 3 | < 0.1%
8#1 | 3 | < 0.1%
(Missing) | 627858 | > 99.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
10,14
40.0%
8#13
30.0%
4,6#1,13
30.0%

Most occurring characters

ValueCountFrequency (%)
117
37.0%
,10
21.7%
#6
 
13.0%
04
 
8.7%
43
 
6.5%
63
 
6.5%
83
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number30
65.2%
Other Punctuation16
34.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
117
56.7%
04
 
13.3%
43
 
10.0%
63
 
10.0%
83
 
10.0%
Other Punctuation
ValueCountFrequency (%)
,10
62.5%
#6
37.5%

Most occurring scripts

ValueCountFrequency (%)
Common46
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
117
37.0%
,10
21.7%
#6
 
13.0%
04
 
8.7%
43
 
6.5%
63
 
6.5%
83
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII46
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
117
37.0%
,10
21.7%
#6
 
13.0%
04
 
8.7%
43
 
6.5%
63
 
6.5%
83
 
6.5%

PRESCRIP_ERROR
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
0 | 617725
1 | 10143

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 627868
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 617725 | 98.4%
1 | 10143 | 1.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
0617725
98.4%
110143
 
1.6%

Most occurring characters

ValueCountFrequency (%)
0617725
98.4%
110143
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number627868
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0617725
98.4%
110143
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common627868
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0617725
98.4%
110143
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII627868
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0617725
98.4%
110143
 
1.6%

ADHERENCE
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value | Count
1 | 346979
0 | 280889

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 627868
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 346979 | 55.3%
0 | 280889 | 44.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 346979 | 55.3%
0 | 280889 | 44.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 346979 | 55.3%
0 | 280889 | 44.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 627868 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 346979 | 55.3%
0 | 280889 | 44.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 627868 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 346979 | 55.3%
0 | 280889 | 44.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 627868 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 346979 | 55.3%
0 | 280889 | 44.7%

NEXT_APPOINTMENT
Unsupported

REJECTED
UNSUPPORTED

Missing: 6
Missing (%): < 0.1%
Memory size: 4.8 MiB

DMOC_TYPE
Categorical

HIGH CORRELATION
MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 460552
Missing (%): 73.4%
Memory size: 4.8 MiB

Value | Count
MMD | 76108
Same Facility Refill | 53701
Individual delivery/home-based | 21053
MMS | 14898
CPARP | 925
Other values (6) | 631

Length

Max length: 51
Median length: 3
Mean length: 11.95987831
Min length: 3

Characters and Unicode

Total characters: 2001079
Distinct characters: 37
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
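
To illustrate how characters map to Unicode general categories, here is a minimal sketch using Python's standard `unicodedata` module on three DMOC_TYPE values taken from this section. The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its figures. The two-letter codes correspond to the category names used in the report: `Lu` Uppercase Letter, `Ll` Lowercase Letter, `Zs` Space Separator, `Po` Other Punctuation, `Pd` Dash Punctuation, `Ps` Open Punctuation, `Pe` Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three values that appear in the DMOC_TYPE column.
sample = [
    "MMD",
    "Individual delivery/home-based",
    "Different Facility Refill (Private hospital/clinic)",
]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

These three strings alone already cover seven categories, the same number the report lists as distinct categories for this variable.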

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MMD
2nd row: MMD
3rd row: MMD
4th row: MMS
5th row: Same Facility Refill

Common Values

Value | Count | Frequency (%)
MMD | 76108 | 12.1%
Same Facility Refill | 53701 | 8.6%
Individual delivery/home-based | 21053 | 3.4%
MMS | 14898 | 2.4%
CPARP | 925 | 0.1%
Different Facility Refill (Private hospital/clinic) | 269 | < 0.1%
CARC | 199 | < 0.1%
Group delivery/CARC | 99 | < 0.1%
Mobile van/other vehicle | 40 | < 0.1%
Fixed or ad hoc pick up points | 14 | < 0.1%
(Missing) | 460552 | 73.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
mmd | 76108 | 25.6%
facility | 53970 | 18.2%
refill | 53970 | 18.2%
same | 53701 | 18.1%
individual | 21053 | 7.1%
delivery/home-based | 21053 | 7.1%
mms | 14898 | 5.0%
cparp | 925 | 0.3%
hospital/clinic | 269 | 0.1%
private | 269 | 0.1%
Other values (15) | 894 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
i | 226635 | 11.3%
l | 204733 | 10.2%
e | 193072 | 9.6%
M | 182052 | 9.1%
a | 150369 | 7.5%
(space) | 129794 | 6.5%
d | 84339 | 4.2%
D | 76377 | 3.8%
y | 75122 | 3.8%
m | 74754 | 3.7%
Other values (27) | 603832 | 30.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1366003 | 68.3%
Uppercase Letter | 462230 | 23.1%
Space Separator | 129794 | 6.5%
Other Punctuation | 21461 | 1.1%
Dash Punctuation | 21053 | 1.1%
Open Punctuation | 269 | < 0.1%
Close Punctuation | 269 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 226635 | 16.6%
l | 204733 | 15.0%
e | 193072 | 14.1%
a | 150369 | 11.0%
d | 84339 | 6.2%
y | 75122 | 5.5%
m | 74754 | 5.5%
t | 54841 | 4.0%
c | 54576 | 4.0%
f | 54508 | 4.0%
Other values (11) | 193054 | 14.1%

Uppercase Letter
Value | Count | Frequency (%)
M | 182052 | 39.4%
D | 76377 | 16.5%
S | 68599 | 14.8%
R | 55193 | 11.9%
F | 53984 | 11.7%
I | 21053 | 4.6%
P | 2119 | 0.5%
C | 1521 | 0.3%
A | 1223 | 0.3%
G | 99 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 129794 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 21461 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 21053 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 269 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 269 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1828233 | 91.4%
Common | 172846 | 8.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 226635 | 12.4%
l | 204733 | 11.2%
e | 193072 | 10.6%
M | 182052 | 10.0%
a | 150369 | 8.2%
d | 84339 | 4.6%
D | 76377 | 4.2%
y | 75122 | 4.1%
m | 74754 | 4.1%
S | 68599 | 3.8%
Other values (22) | 492181 | 26.9%

Common
Value | Count | Frequency (%)
(space) | 129794 | 75.1%
/ | 21461 | 12.4%
- | 21053 | 12.2%
( | 269 | 0.2%
) | 269 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2001079 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 226635 | 11.3%
l | 204733 | 10.2%
e | 193072 | 9.6%
M | 182052 | 9.1%
a | 150369 | 7.5%
(space) | 129794 | 6.5%
d | 84339 | 4.2%
D | 76377 | 3.8%
y | 75122 | 3.8%
m | 74754 | 3.7%
Other values (27) | 603832 | 30.2%

BODY_WEIGHT
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 75
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.138822348
Minimum: 0
Maximum: 856
Zeros: 625049
Zeros (%): 99.6%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 856
Range: 856
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.469496776
Coefficient of variation (CV): 24.99235048
Kurtosis: 31028.80829
Mean: 0.138822348
Median Absolute Deviation (MAD): 0
Skewness: 138.7964098
Sum: 87162.11
Variance: 12.03740788
Monotonicity: Not monotonic
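
Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A minimal sketch that checks the reported figures against each other; the variable names are illustrative, and the observation count (627868) is taken from the report overview.

```python
import math

# Figures as reported for BODY_WEIGHT.
n_observations = 627868   # from the dataset overview
total = 87162.11          # Sum
mean = 0.138822348        # Mean
std = 3.469496776         # Standard deviation
zeros = 625049            # Zeros

derived_mean = total / n_observations
variance = std ** 2
cv = std / mean                           # coefficient of variation
zero_pct = 100.0 * zeros / n_observations

print(f"mean from sum: {derived_mean:.9f}")
print(f"variance:      {variance:.8f}")
print(f"CV:            {cv:.8f}")
print(f"zeros (%):     {zero_pct:.1f}")
```

A coefficient of variation near 25 together with 99.6% zeros indicates that the mean is dominated by a small number of non-zero entries, which matches the SKEWED and ZEROS flags on this variable.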
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 625049 | 99.6%
30 | 254 | < 0.1%
20 | 244 | < 0.1%
25 | 216 | < 0.1%
68 | 137 | < 0.1%
17 | 109 | < 0.1%
56 | 104 | < 0.1%
10 | 101 | < 0.1%
21 | 99 | < 0.1%
15 | 99 | < 0.1%
Other values (65) | 1456 | 0.2%
Minimum values
Value | Count | Frequency (%)
0 | 625049 | 99.6%
0.25 | 3 | < 0.1%
1.1 | 3 | < 0.1%
1.3 | 4 | < 0.1%
5 | 15 | < 0.1%
6 | 9 | < 0.1%
7 | 12 | < 0.1%
8 | 14 | < 0.1%
8.5 | 3 | < 0.1%
9 | 49 | < 0.1%
Maximum values
Value | Count | Frequency (%)
856 | 4 | < 0.1%
687 | 3 | < 0.1%
78 | 39 | < 0.1%
72 | 3 | < 0.1%
71 | 6 | < 0.1%
69 | 11 | < 0.1%
68 | 137 | < 0.1%
65 | 45 | < 0.1%
64 | 53 | < 0.1%
63 | 3 | < 0.1%

Interactions

[49 interaction plots not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exact linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
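
As a minimal sketch of this calculation in plain Python (the function name `pearson_r` and the sample data are illustrative, not the profiling tool's implementation): the covariance and the standard deviations share the same 1/n factor, so it cancels and only the sums of deviations are needed.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The common 1/n factor cancels between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r = pearson_r(x, y)
# Shifting and rescaling y by a positive factor leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * v + 5.0 for v in y])
# A perfectly decreasing linear relationship gives r = -1.
r_negative = pearson_r(x, [10.0 - 2.0 * v for v in x])
print(r, r_rescaled, r_negative)
```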

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
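
A minimal sketch of this rank-based calculation in plain Python, assuming tied values receive the average of their ranks (a common convention, stated here as an assumption). The names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values starting at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (1/n cancels)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman_rho(x, y):
    """Pearson correlation applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# y = x**3 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this cubic example ρ reaches its maximum while Pearson's r stays below 1, which is the difference the paragraph above describes.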

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
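
A minimal sketch of the pair-counting definition given here, which corresponds to the variant usually called tau-a. Many statistical packages report a tie-adjusted variant (tau-b) instead, so values may differ when the data contain ties. The function name `kendall_tau_a` and the sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = 0
    discordant = 0
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # only the pair (2, 3) is out of order
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.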

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
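
One way to read the nullity correlation mentioned above is as the Pearson correlation between per-column missingness indicators (1 where a value is missing, 0 where it is present). The sketch below assumes that reading and uses a handful of made-up rows, not records from the dataset; the function name `nullity_correlation` is illustrative.

```python
import math


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    n = len(rows)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    dev_a = math.sqrt(sum((p - mean_a) ** 2 for p in a))
    dev_b = math.sqrt(sum((q - mean_b) ** 2 for q in b))
    if dev_a == 0.0 or dev_b == 0.0:
        # A column that is always present or always missing has no variation.
        raise ValueError("nullity correlation is undefined for a constant column")
    return cov / (dev_a * dev_b)


# Made-up rows for illustration only; None marks a missing value.
rows = [
    {"DMOC_TYPE": None, "ADR_IDS": None},
    {"DMOC_TYPE": None, "ADR_IDS": None},
    {"DMOC_TYPE": "MMD", "ADR_IDS": None},
    {"DMOC_TYPE": "MMS", "ADR_IDS": "8#1"},
]
corr = nullity_correlation(rows, "DMOC_TYPE", "ADR_IDS")
print(corr)
```

A value near +1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.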

Sample

First rows

(index) | State | L.G.A | Facility Name | Regimen Line | Regimen | PHARMACY_ID | PATIENT_ID | FACILITY_ID | DATE_VISIT | DURATION | MORNING | AFTERNOON | EVENING | ADR_SCREENED | ADR_IDS | PRESCRIP_ERROR | ADHERENCE | NEXT_APPOINTMENT | DMOC_TYPE | BODY_WEIGHT
0 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 43649 | 19757 | 1686 | 2019-06-05 00:00:00 | 60 | 1.0 | 0 | 0.0 | No | NaN | 0 | 1 | 2019-08-05 00:00:00 | NaN | 0.0
1 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 43651 | 19228 | 1686 | 2017-07-18 00:00:00 | 60 | 1.0 | 0 | 1.0 | No | NaN | 0 | 1 | 2017-09-16 00:00:00 | NaN | 0.0
2 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 43655 | 19182 | 1686 | 2020-04-24 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2020-10-24 00:00:00 | MMD | 0.0
3 | Cross River | Obudu | Obudu Clinic | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 43658 | 19123 | 1686 | 2016-10-19 00:00:00 | 60 | 1.0 | 0 | 0.0 | No | NaN | 0 | 1 | 2016-12-19 00:00:00 | NaN | 0.0
4 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 43663 | 19764 | 1686 | 2017-05-23 00:00:00 | 30 | 1.0 | 0 | 1.0 | No | NaN | 0 | 1 | 2017-06-21 00:00:00 | NaN | 0.0
5 | Cross River | Obudu | Obudu Clinic | ART First Line Children | AZT/3TC/NVP(60/30/50mg) | 43665 | 19487 | 1686 | 2017-06-23 00:00:00 | 120 | 1.0 | 0 | 1.0 | No | NaN | 0 | 1 | 2017-08-23 00:00:00 | NaN | 0.0
6 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 43667 | 19832 | 1686 | 2019-02-21 00:00:00 | 60 | 1.0 | 0 | 1.0 | No | NaN | 0 | 1 | 2019-04-21 00:00:00 | NaN | 0.0
7 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 43671 | 19759 | 1686 | 2019-06-15 00:00:00 | 60 | 0.0 | 0 | 1.0 | No | NaN | 0 | 1 | 2019-08-15 00:00:00 | NaN | 0.0
8 | Cross River | Obudu | Obudu Clinic | ART First Line Adult | AZT(300mg)+3TC(150mg)+NVP(200mg) | 43673 | 19475 | 1686 | 2016-11-21 00:00:00 | 60 | 1.0 | 0 | 1.0 | No | NaN | 0 | 1 | 2017-01-19 00:00:00 | NaN | 0.0
9 | Cross River | Obudu | Obudu Clinic | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 43676 | 19934 | 1686 | 2020-02-11 00:00:00 | 90 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2020-05-11 00:00:00 | MMD | 0.0

Last rows

(index) | State | L.G.A | Facility Name | Regimen Line | Regimen | PHARMACY_ID | PATIENT_ID | FACILITY_ID | DATE_VISIT | DURATION | MORNING | AFTERNOON | EVENING | ADR_SCREENED | ADR_IDS | PRESCRIP_ERROR | ADHERENCE | NEXT_APPOINTMENT | DMOC_TYPE | BODY_WEIGHT
627858 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082385 | 24597 | 1752 | 2021-05-26 00:00:00 | 180 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-11-29 00:00:00 | Same Facility Refill | 0.0
627859 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082386 | 113174 | 1752 | 2021-05-25 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-23 00:00:00 | Same Facility Refill | 0.0
627860 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082390 | 154849 | 1752 | 2021-05-25 00:00:00 | 90 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-08-27 00:00:00 | Same Facility Refill | 0.0
627861 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082391 | 23910 | 1752 | 2021-05-25 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-16 00:00:00 | Same Facility Refill | 0.0
627862 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082392 | 26495 | 1752 | 2021-05-26 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-29 00:00:00 | Same Facility Refill | 0.0
627863 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082397 | 27515 | 1752 | 2021-05-25 00:00:00 | 180 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-11-29 00:00:00 | Same Facility Refill | 0.0
627864 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082398 | 24628 | 1752 | 2021-05-27 00:00:00 | 90 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-08-30 00:00:00 | Same Facility Refill | 0.0
627865 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082399 | 27517 | 1752 | 2021-05-25 00:00:00 | 90 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-08-30 00:00:00 | Same Facility Refill | 0.0
627866 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082400 | 25502 | 1752 | 2021-05-26 00:00:00 | 180 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-11-29 00:00:00 | Same Facility Refill | 0.0
627867 | Cross River | Ogoja | Ogoja Catholic Maternity Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082401 | 23377 | 1752 | 2021-05-26 00:00:00 | 90 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-08-30 00:00:00 | Same Facility Refill | 0.0